Inflectional Morphology Analyzer for Sanskrit
Authors
Abstract
The paper describes a Sanskrit morphological analyzer that identifies and analyzes inflected noun forms and verb forms in any given sandhi-free text. The system, which has been developed as a Java servlet with an RDBMS back end, can be tested at http://sanskrit.jnu.ac.in (Language Processing Tools > Sanskrit Tinanta Analyzer/Subanta Analyzer) with Sanskrit data as Unicode text. Subsequently, the separate subanta and tiṅanta systems will be combined into a single system of sentence analysis with karaka interpretation. Currently, the system checks each word and labels it with one of three basic POS categories: subanta, tiṅanta, or avyaya. Thereafter, each subanta is sent for subanta processing based on an example database and a rule database. The verbs are analyzed using a database of verb roots and forms, as well as by reverse morphology based on Pāṇinian techniques. Future enhancements include plugging the Amarakosha (http://sanskrit.jnu.ac.in/amara) and other noun lexicons into the subanta system. The tiṅanta system will be enhanced by the kṛdanta analysis module being developed separately.
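The labeling flow described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the tiny in-memory tables stand in for the RDBMS tables, the names (`AVYAYAS`, `SUBANTA_EXAMPLES`, `SUBANTA_RULES`, `VERB_STEMS`, `TINANTA_RULES`, `analyze`) are hypothetical, the input is romanized rather than Devanagari, and simple suffix stripping stands in for the Pāṇinian reverse morphology. It is not the authors' implementation.

```python
# Toy stand-ins for the system's databases (all entries are illustrative).
AVYAYAS = {"ca", "na", "eva", "api", "iti"}  # indeclinables

# "Example database": stored forms that are looked up directly.
SUBANTA_EXAMPLES = {"aham": ("asmad", "nominative", "singular")}

# "Rule database": (ending, case, number) for masculine a-stems only,
# listed longest ending first so the most specific rule wins.
SUBANTA_RULES = [
    ("asya", "genitive", "singular"),
    ("ena", "instrumental", "singular"),
    ("am", "accusative", "singular"),
]

# Present stems (minus the thematic vowel) mapped to their verb roots.
VERB_STEMS = {"gacch": "gam", "vad": "vad", "bhav": "bhU"}

# Verb endings: (ending, person, number), longest first.
TINANTA_RULES = [
    ("anti", "third", "plural"),
    ("ati", "third", "singular"),
]


def analyze(word):
    """Label one sandhi-free word as avyaya, tinanta, or subanta."""
    if word in AVYAYAS:
        return {"pos": "avyaya", "form": word}

    # Reverse morphology for verbs: strip an ending, look up what is left.
    for ending, person, number in TINANTA_RULES:
        stem = word[: -len(ending)]
        if word.endswith(ending) and stem in VERB_STEMS:
            return {"pos": "tinanta", "root": VERB_STEMS[stem],
                    "person": person, "number": number}

    # Nouns: try the example table first, then fall back to the rules.
    if word in SUBANTA_EXAMPLES:
        stem, case, number = SUBANTA_EXAMPLES[word]
        return {"pos": "subanta", "stem": stem, "case": case, "number": number}
    for ending, case, number in SUBANTA_RULES:
        if word.endswith(ending) and len(word) > len(ending):
            # Restore the stem-final "a" removed together with the ending.
            stem = word[: -len(ending)] + "a"
            return {"pos": "subanta", "stem": stem, "case": case, "number": number}

    return {"pos": "unknown", "form": word}


if __name__ == "__main__":
    for token in "devasya gacchati ca aham".split():
        print(token, analyze(token))
```

Checking the example table before the rule table keeps stored forms such as "aham" from being misanalyzed by a general suffix rule (here, the "am" accusative rule); whether the actual system orders its lookups this way is not stated in the abstract.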
Similar resources
Hindi Derivational Morphological Analyzer
Hindi is an Indian language which is relatively rich in morphology. A few morphological analyzers of this language have been developed. However, they give only inflectional analysis of the language. In this paper, we present our Hindi derivational morphological analyzer. Our algorithm upgrades an existing inflectional analyzer to a derivational analyzer and primarily achieves two goals. First, ...
Automatic Sanskrit Segmentizer Using Finite State Transducers
In this paper, we propose a novel method for automatic segmentation of a Sanskrit string into different words. The input for our segmentizer is a Sanskrit string either encoded as a Unicode string or as a Roman transliterated string and the output is a set of possible splits with weights associated with each of them. We followed two different approaches to segment a Sanskrit text using sandhi ...
Amazighe Verbal Inflectional Morphology: A New Approach for Analysis and Generation
Amazighe inflectional morphology poses special challenges to Natural Language Processing (NLP) systems. Its rich morphology and the highly complex word formation process of roots and patterns make NLP tools for Amazighe very challenging. In this paper we present an approach for inflectional morphological analysis and generation for Amazighe verbs. The main motivation for this work is to obtain ...
Computational Model to Generate Case-Inflected Forms of masculine Nouns for Word Search in Sanskrit E-Text
The problem of word search in Sanskrit is inseparable from complexities that include those caused by euphonic conjunctions and case-inflections. The case-inflectional forms of a noun normally number 24 owing to the fact that in Sanskrit there are eight cases and three numbers: singular, dual, and plural. The traditional method of generating these inflectional forms is rather elaborate owing to th...
Computer Analysis of the Turkmen Language Morphology
This paper describes the implementation of a two-level morphological analyzer for the Turkmen language. Like all Turkic languages, Turkmen is an agglutinative language that has productive inflectional and derivational suffixes. In this work, we implemented a finite-state two-level morphological analyzer for the Turkmen language by using Xerox Finite State Tools.
Publication date: 2008